
Add Triton benchmarks for blog #509

Open
rishic3 wants to merge 10 commits into main from triton-benchmarks
Conversation

@rishic3 (Collaborator) commented Mar 11, 2025

Scripts, configs, and instructions to reproduce the blog benchmarks, demonstrating the benefit of using Triton for CPU parallelism.

@rishic3 rishic3 requested a review from eordentlich March 11, 2025 22:13
@rishic3 rishic3 force-pushed the triton-benchmarks branch from 7c26cc0 to 2614369 on March 12, 2025 01:59
1. [`spark_resnet.py`](spark_resnet.py): Uses `predict_batch_udf` to perform in-process prediction on the GPU (a minimal sketch follows this list).
2. [`spark_resnet_triton.py`](spark_resnet_triton.py): Uses `predict_batch_udf` to send inference requests to Triton, which performs inference on the GPU.
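
For reference, here is a minimal sketch of the in-process pattern used by `spark_resnet.py`, assuming a torchvision ResNet-50 and a DataFrame `df` with an `images` column of flattened image arrays; the names, shapes, and batch size are illustrative, not taken from the script itself.

```python
import numpy as np
import torch
from torchvision.models import resnet50, ResNet50_Weights
from pyspark.ml.functions import predict_batch_udf
from pyspark.sql.types import ArrayType, FloatType

def make_resnet_fn():
    # Loaded once per Python worker; this task's model occupies the GPU.
    model = resnet50(weights=ResNet50_Weights.DEFAULT).eval().cuda()

    def predict(inputs: np.ndarray) -> np.ndarray:
        # inputs arrives as a (batch, 3, 224, 224) numpy array per input_tensor_shapes.
        with torch.no_grad():
            batch = torch.from_numpy(inputs).float().cuda()
            return model(batch).cpu().numpy()

    return predict

classify = predict_batch_udf(
    make_resnet_fn,
    return_type=ArrayType(FloatType()),
    batch_size=64,
    input_tensor_shapes=[[3, 224, 224]],
)

# df is assumed to have an "images" column of flattened float arrays.
preds = df.withColumn("preds", classify("images"))
```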

Spark cannot change task parallelism within a stage based on the resources each step requires (i.e., multiple CPUs for preprocessing vs. a single GPU for inference). Therefore, implementation (1) is limited to one task per GPU so that only a single instance of the model resides on the GPU. In contrast, implementation (2) can run as many tasks in parallel as there are cores on the executor, since Triton handles inference on the GPU.
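
This difference typically shows up in the GPU resource configs. The sketch below is an assumption about how such settings might look (the values are not copied from this PR's configs): a task that claims the whole GPU serializes to one task per GPU in mode (1), while a small fractional claim lets mode (2) run one task per executor core.

```python
from pyspark.sql import SparkSession

def build_session(task_gpu_amount: str) -> SparkSession:
    return (
        SparkSession.builder
        # one GPU per executor in both modes
        .config("spark.executor.resource.gpu.amount", "1")
        # how much of that GPU each task claims determines task parallelism
        .config("spark.task.resource.gpu.amount", task_gpu_amount)
        .getOrCreate()
    )

# (1) spark_resnet.py: each task owns the whole GPU, so only one task
#     (and one model instance) runs per GPU at a time.
spark = build_session("1")

# (2) spark_resnet_triton.py: tasks only preprocess on CPU and send requests
#     to Triton, so a fractional share (e.g. 1/8 on an 8-core executor) lets
#     as many tasks run concurrently as there are cores.
# spark = build_session("0.125")
```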
Collaborator commented:
For resnet-50 could multiple model instances fit in the GPU? If so, might be good to benchmark that case, where multiple spark tasks run per GPU with each having its own model instance. Due to multiple processes, GPU compute will be time sliced so perf could be hit, but still interesting to compare.

Collaborator commented:

Would it make sense to consolidate this script with spark_resnet.py and select library or triton via cli argument?
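
If consolidated, the combined script could hypothetically take the shape sketched below; the `--backend` flag and helper names are illustrative and not part of this PR.

```python
import argparse

def run_torch_benchmark() -> None:
    # placeholder for the in-process predict_batch_udf path (spark_resnet.py)
    ...

def run_triton_benchmark() -> None:
    # placeholder for the Triton client path (spark_resnet_triton.py)
    ...

def main() -> None:
    parser = argparse.ArgumentParser(description="ResNet-50 inference benchmark")
    parser.add_argument(
        "--backend",
        choices=["torch", "triton"],
        default="torch",
        help="run inference in-process (torch) or via a Triton server (triton)",
    )
    args = parser.parse_args()
    run_triton_benchmark() if args.backend == "triton" else run_torch_benchmark()

if __name__ == "__main__":
    main()
```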
